Modeling Prosody Pattern of Chinese Expressive Speech and Its Application in Personalized Speech Conversion
Abstract
This paper proposes an approach for modeling the prosody patterns of acoustic features in Chinese expressive speech. In a Chinese multi-syllabic prosodic word, one syllable is identified as the core syllable, based on the observation that speakers usually put more emphasis on that syllable. The variations of the acoustic features from neutral to expressive speech are then analyzed for both the core and non-core syllables. The analysis finds that the acoustic variations of the core syllable are the most significant, that the variations of the non-core syllables are influenced by the core syllable, and that this influence decreases as a non-core syllable lies farther from the core syllable. A double-layer perturbation model is then proposed to model these prosody patterns and is further applied to generate personalized prosody patterns for personalized speech conversion. Experimental results indicate that the model can capture and regenerate the personality of prosodic features in expressive speech.
Similar resources
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One of the interesting topics in the multimedia domain concerns enabling computers to produce speech. Speech synthesis grants computers the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...
Modeling of Fundamental Frequency Contour of Thai Expressive Speech using Fujisaki’s Model and Structural Model
Problem statement: In spontaneous speech communication, prosody is an important factor that must be taken into account, since prosody affects not only the naturalness but also the intelligibility of speech. Focusing on the synthesis of Thai expressive speech, a number of systems have been developed over the years. However, the expressive speech with various speaking styles has not been accomplishe...
Structural Modeling of Fundamental Frequency Contour for Thai Expressive Speech
Problem statement: Appropriate modeling of the fundamental frequency (F0) contour of speech is a key factor in preserving the quality of speech prosody. One successful approach has been developed for the tonal language Mandarin Chinese. It is based on the assumption that the behavioral characteristics of vocal-fold elongation in vibration could be approximated by those of a simple forced vibrating sy...
Emotion conversion using Feedforward Neural Networks
An emotion is made of several components, such as physiological changes in the body, subjective feelings, and expressive behaviours. These changes in the speech signal are mainly observed in prosody parameters such as pitch, duration and energy. In this work, prosody parameters are modified using instants of significant excitation (epochs), and these instants are detected using Zero Frequency Filteri...
Prosody generation in Chinese synthesis using the template of quantified prosodic unit and base intonation contour
This paper presents a prosody generation method for Mandarin Chinese using the template of quantified prosodic unit and base intonation contour. This method uses the prosodic features extracted by rule from the syllables in the prosody words as the base unit, and integrates the prosody rules in the prosody words of Mandarin Chinese and the base intonation contour to achieve the prosody contours with...
Publication date: 2012